## DataFrame with 130 rows and 28 columns
## NREADS NALIGNED RALIGN TOTAL_DUP PRIMER INSERT_SZ
## <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
## SRR1275356 10554900 7555880 71.5862 58.4931 0.0217638 208
## SRR1274090 196162 182494 93.0323 14.5122 0.0366826 247
## SRR1275251 8524470 5858130 68.7213 65.0428 0.0351827 230
## SRR1275287 7229920 5891540 81.4884 49.7609 0.0208685 222
## SRR1275364 5403640 4482910 82.9609 66.5788 0.0298284 228
## ... ... ... ... ... ... ...
## SRR1275259 5949930 4181040 70.2705 52.5975 0.0205253 224
## SRR1275253 10319900 7458710 72.2747 54.9637 0.0205342 207
## SRR1275285 5300270 4276650 80.6873 41.6394 0.0227383 222
## SRR1275366 7701320 6373600 82.7600 68.9431 0.0266275 233
## SRR1275261 13425000 9554960 71.1727 62.0001 0.0200522 241
## INSERT_SZ_STD COMPLEXITY NDUPR PCT_RIBOSOMAL_BASES
## <numeric> <numeric> <numeric> <numeric>
## SRR1275356 63 0.868928 0.343113 2e-06
## SRR1274090 133 0.997655 0.935730 0e+00
## SRR1275251 89 0.789252 0.201082 0e+00
## SRR1275287 78 0.898100 0.538191 0e+00
## SRR1275364 76 0.890693 0.391660 0e+00
## ... ... ... ... ...
## SRR1275259 80 0.898898 0.399189 5e-06
## SRR1275253 62 0.863618 0.344744 0e+00
## SRR1275285 76 0.920068 0.638765 2e-06
## SRR1275366 83 0.860359 0.343122 0e+00
## SRR1275261 105 0.806833 0.234551 0e+00
## PCT_CODING_BASES PCT_UTR_BASES PCT_INTRONIC_BASES
## <numeric> <numeric> <numeric>
## SRR1275356 0.125806 0.180954 0.613229
## SRR1274090 0.309822 0.412917 0.205185
## SRR1275251 0.398461 0.473884 0.039886
## SRR1275287 0.196420 0.227592 0.498944
## SRR1275364 0.138617 0.210406 0.543941
## ... ... ... ...
## SRR1275259 0.261384 0.383665 0.264250
## SRR1275253 0.110732 0.190036 0.606814
## SRR1275285 0.143667 0.231103 0.540070
## SRR1275366 0.215696 0.307817 0.409437
## SRR1275261 0.408881 0.391068 0.147748
## PCT_INTERGENIC_BASES PCT_MRNA_BASES MEDIAN_CV_COVERAGE
## <numeric> <numeric> <numeric>
## SRR1275356 0.080008 0.306760 1.495770
## SRR1274090 0.072076 0.722739 1.007580
## SRR1275251 0.087770 0.872345 1.242990
## SRR1275287 0.077044 0.424013 0.775981
## SRR1275364 0.107035 0.349024 1.441370
## ... ... ... ...
## SRR1275259 0.090696 0.645049 1.101040
## SRR1275253 0.092418 0.300768 1.701690
## SRR1275285 0.085158 0.374770 0.714087
## SRR1275366 0.067050 0.523513 1.251980
## SRR1275261 0.052302 0.799949 0.939066
## MEDIAN_5PRIME_BIAS MEDIAN_3PRIME_BIAS
## <numeric> <numeric>
## SRR1275356 0.000000 0.166122
## SRR1274090 0.181742 0.698991
## SRR1275251 0.000000 0.340046
## SRR1275287 0.010251 0.350915
## SRR1275364 0.000000 0.204074
## ... ... ...
## SRR1275259 0.000000 0.315550
## SRR1275253 0.000000 0.106902
## SRR1275285 0.019578 0.419987
## SRR1275366 0.000000 0.281554
## SRR1275261 0.000292 0.290117
## MEDIAN_5PRIME_TO_3PRIME_BIAS sample_id.x Lane_ID
## <numeric> <character> <character>
## SRR1275356 1.036250 SRX534610 D24VYACXX130502:4
## SRR1274090 0.293510 SRX534823 1
## SRR1275251 0.201518 SRX534623 D24VYACXX130502:4
## SRR1275287 0.292838 SRX534641 D24VYACXX130502:1
## SRR1275364 0.619863 SRX534614 D24VYACXX130502:7
## ... ... ... ...
## SRR1275259 0.350391 SRX534627 D24VYACXX130502:4
## SRR1275253 0.944856 SRX534624 D24VYACXX130502:3
## SRR1275285 0.194939 SRX534640 D24VYACXX130502:1
## SRR1275366 0.388272 SRX534615 D24VYACXX130502:8
## SRR1275261 0.384402 SRX534628 D24VYACXX130502:3
## LibraryName avgLength spots Biological_Condition
## <character> <integer> <integer> <character>
## SRR1275356 GW16_2 202 9818076 GW16
## SRR1274090 NPC_9 60 95454 NPC
## SRR1275251 GW16_8 202 7935952 GW16
## SRR1275287 GW21+3_2 202 6531944 GW21+3
## SRR1275364 GW16_23 202 4919561 GW16
## ... ... ... ... ...
## SRR1275259 GW21_3 202 5528916 GW21
## SRR1275253 GW16_9 202 9562204 GW16
## SRR1275285 GW21+3_16 202 4860721 GW21+3
## SRR1275366 GW16_24 202 7153688 GW16
## SRR1275261 GW21_4 202 12142387 GW21
## Coverage_Type Cluster1 Cluster2
## <character> <factor> <factor>
## SRR1275356 High IIIb III
## SRR1274090 Low 1a I
## SRR1275251 High NA III
## SRR1275287 High 1c I
## SRR1275364 High IIIb III
## ... ... ... ...
## SRR1275259 High NA III
## SRR1275253 High IIIb III
## SRR1275285 High Iva IV
## SRR1275366 High NA III
## SRR1275261 High II II
Something is clearly wrong with this. I am not sure whether the weights or the average is at fault. I suspect the average, since we are estimating a single fixed mean in a heterogeneous population.
Here, we repeat the analysis separately for the high-coverage-only data and the low-coverage-only data.
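As a minimal sketch of that split, the snippet below partitions samples by the `Coverage_Type` column from the metadata printed above. It assumes `Coverage_Type` is the variable that defines the two groups. The small `qc` data frame is a hypothetical stand-in holding only three of the 130 rows (values copied from the printout), and the names `qc`, `qc_by_cov`, `qc_high` and `qc_low` are illustrative, not objects from the original analysis.

```r
# Hypothetical stand-in for the sample metadata: three rows copied from the
# printed DataFrame, keeping only the columns needed for the illustration.
qc <- data.frame(
  NREADS        = c(10554900, 196162, 8524470),
  Coverage_Type = c("High", "Low", "High"),
  row.names     = c("SRR1275356", "SRR1274090", "SRR1275251"),
  stringsAsFactors = FALSE
)

# Split the rows by coverage type so each subset can be analysed on its own.
qc_by_cov <- split(qc, qc$Coverage_Type)
qc_high   <- qc_by_cov[["High"]]
qc_low    <- qc_by_cov[["Low"]]

# Quick check of how many samples fall in each group.
table(qc$Coverage_Type)
```

If the metadata is stored as the `colData` of a `SummarizedExperiment`-like object, the same idea would presumably be expressed by indexing its columns with a logical vector such as `Coverage_Type == "High"`; the exact object used in the original analysis is not shown here.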